Multiple data hazards detection and resolution unit

ABSTRACT

Order indication logic can be recycled for at least two different data hazards, thus reducing the amount of processor real estate consumed by data hazard resolution logic. The logic also allows a single priority picker to be utilized for coloring without the cost of additional pipeline stages. A single priority picker can be utilized to identify memory operations for performing RAW bypass and for resolving OERs. For instance, a data hazard resolution unit resolves at least two different data hazards between resident memory operations and incoming memory operations with a set of logic that indicates order of the resident memory operations relative to the incoming memory operations. The indicated order corresponds to the data hazard being resolved. The data hazard resolution unit includes a priority picker to select one of the indicated resident memory operations for either data hazard.

CROSS REFERENCE TO RELATED APPLICATION

This application is related to commonly owned, co-pending U.S. patent application Ser. No. 10/747,584, filed Dec. 29, 2003, naming as inventors Krishna M. Thatipelli and Balakrishna Venkatrao, entitled “Efficient Read After Write Bypass,” which is incorporated herein by reference in its entirety.

BACKGROUND

1. Field of the Invention

The present invention relates to the field of computers. More specifically, the present invention relates to computer architecture.

2. Description of the Related Art

Out-of-order processors issue and execute instructions out-of-order to gain performance benefits. However, data dependencies exist between certain instructions, and those dependencies must be preserved. Violation of those data dependencies results in a data hazard. Two particular data hazards are 1) read-after-write (RAW) and 2) write-after-read (WAR), also referred to as overeager load (OEL).

A store instruction writes data into a designated memory location. A load instruction reads data from a designated memory location. If the store and load instructions are accessing the same memory location and the store instruction is older than the load instruction (i.e., the store instruction precedes the load instruction in the program sequence that includes these instructions), then the load instruction may depend on the store instruction, assuming there are no other intervening store instructions. An additional factor that affects dependency between instructions is the size of the data being written or read. Since the store instruction requires more time than a load instruction, there is a possibility that the load instruction will access the memory location before the store instruction completes. If so, then the load instruction will access stale data. To resolve this RAW data hazard without losing the benefit of out-of-order processing, RAW bypass is performed. The data being written by the store instruction is passed to the load instruction before the store instruction actually writes it to the memory location.

An OEL hazard occurs when a processor issues and executes a load instruction that depends on an older store instruction before the store instruction is issued. Again, the load instruction will read stale data because the store instruction has not written to the memory location. To avoid this data hazard, the load instruction is rewound (i.e., flushed from the execution pipeline to start over) and a dependency is imposed on the load instruction so that it does not issue until after the store instruction. In addition, some processors utilize a “coloring” technique to identify instructions with data dependencies in order to impose those data dependencies.

A conventional processor resolves these two data hazards with separate logic. The RAW logic identifies issued store instructions that are older than an issued load instruction. The OEL logic identifies issued load instructions that are younger than an issued store instruction. The processor utilizes two separate priority pickers for performing operations to resolve the two data hazards. The separate logic and separate priority pickers occupy valuable area, which becomes even more valuable as processor designs evolve to incorporate more power and functionality.

SUMMARY OF THE INVENTION

It has been discovered that the same logic and a single priority picker can be utilized to identify memory operations for resolving both RAW data hazards and overeager read data hazards. Utilizing a single priority picker and the same logic to handle two different data hazards reduces area consumed by data hazard resolution logic, thus making the valuable processor area available for other purposes. In addition, logic that allows a single priority picker to be utilized provides the benefits of coloring without the cost of additional pipeline stages. The logic selects resident memory operations with a first order characteristic relative to an incoming memory operation for a first data hazard. Manipulating order information of the incoming memory operation allows the same logic to indicate memory operations with a second order characteristic relative to an incoming memory operation for a second data hazard.

These and other aspects of the described invention will be better described with reference to the Description of the Preferred Realization(s) and accompanying Figures.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.

FIG. 1 depicts an exemplary data hazard resolution unit with logic for indicating memory operations with an order characteristic according to realizations of the invention.

FIG. 2 depicts exemplary order information according to realizations of the invention.

FIG. 3 depicts a flowchart for indicating an age characteristic according to realizations of the invention.

FIGS. 4A-4B depict exemplary priority pickers according to realizations of the invention. FIG. 4A depicts an exemplary priority picker that selects a resident memory operation with a particular order characteristic according to realizations of the invention. FIG. 4B depicts another exemplary priority picker that selects a resident memory operation for data hazard resolution operations according to realizations of the invention.

FIG. 5 depicts OER data hazard resolution operations according to realizations of the invention.

FIG. 6 depicts coloring of memory operations for OER data hazard resolution according to realizations of the invention.

FIG. 7 depicts selection of read type memory operations for OER rewind according to realizations of the invention.

FIG. 8 depicts an exemplary memory disambiguation buffer communicating rewind to other units according to realizations of the invention.

FIG. 9 depicts exemplary sub-blocks of a memory disambiguation buffer according to realizations of the invention.

FIG. 10 depicts an exemplary computer system according to realizations of the invention.

The use of the same reference symbols in different drawings indicates similar or identical items.

DESCRIPTION OF THE PREFERRED REALIZATION(S)

The description that follows includes exemplary systems, methods, techniques, instruction sequences and computer program products that embody techniques of the present invention. For instance, realizations of the invention can be implemented in one or more units to determine data hazards and/or resolve data hazards, such as a collection of queues and logic (e.g., a load store queue, a memory disambiguation buffer, etc.). However, it is understood that the described invention may be practiced without these specific details. In other instances, well-known protocols, structures and techniques have not been shown in detail in order not to obscure the invention.

FIG. 1 depicts an exemplary data hazard resolution unit with logic for indicating memory operations with an order characteristic according to realizations of the invention. The data hazard resolution unit 100 (e.g., a load store queue, a memory disambiguation buffer, etc.) includes an order qualifier block 101, a memory operation type qualifier block 103, and a memory operation order characteristic indication block 105. The order qualifier block 101 receives incoming memory operation order information (e.g., index information, age information, scheduling information, wrapping information, etc.) and incoming memory operation type information (e.g., indication of read type memory operation, write type memory operation, etc.). The order qualifier block 101 qualifies the incoming memory operation order information based on the received incoming memory operation type information. If the incoming memory operation type information indicates that an incoming memory operation is of a particular type, then the order qualifier block 101 modifies the incoming memory operation order information. Otherwise, the incoming memory operation order information remains unchanged as it flows to the memory operation order characteristic indication block 105.

FIG. 2 depicts exemplary order information according to realizations of the invention. An operation scheduling unit 201 includes an array of memory operations and their corresponding order information. The exemplary operation scheduling unit of FIG. 2 depicts a 64-entry operation scheduling unit 201 with 6 bits of order information. The most significant bit of the operation scheduling unit 201 order information is a wrap bit from the perspective of a memory disambiguation buffer 203. The memory disambiguation buffer 203 has 32 entries, hence 5 bits of order information.

The operation scheduling unit 201 issues a memory operation from entry 100001 of the operation scheduling unit 201 to the memory disambiguation buffer 203. The issued memory operation, or incoming memory operation from the perspective of the memory disambiguation buffer 203, becomes entry 00001 in the memory disambiguation buffer 203. The order, or age, of the incoming memory operation is determined from both the memory operation's order information in the memory disambiguation buffer and the wrap bit. Various realizations of the invention have different numbers of entries in the operation scheduling unit and the memory disambiguation buffer. In addition, the ratio of entries between the operation scheduling unit and the memory disambiguation buffer varies in realizations of the invention. The different ratios may affect order information in the memory disambiguation buffer in various realizations of the invention (e.g., multiple wrap bits).
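By way of illustration only, the split of the FIG. 2 order information into a wrap bit and a buffer index might be modeled in software as follows. The function name and constant are hypothetical and simply mirror the 64-entry and 32-entry example above; other realizations may size the fields differently.

```python
# Hypothetical sketch of the FIG. 2 example: a 64-entry operation scheduling
# unit (6 bits of order information) feeding a 32-entry memory
# disambiguation buffer (5 bits of order information plus a wrap bit).
MDB_INDEX_BITS = 5  # assumption: 32-entry buffer, as in the FIG. 2 example


def split_order_info(osu_entry: int) -> tuple[int, int]:
    """Return (wrap_bit, mdb_index) for a 6-bit scheduling unit entry number."""
    wrap_bit = (osu_entry >> MDB_INDEX_BITS) & 1
    mdb_index = osu_entry & ((1 << MDB_INDEX_BITS) - 1)
    return wrap_bit, mdb_index


wrap, index = split_order_info(0b100001)
print(wrap, format(index, "05b"))  # prints: 1 00001
```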

Returning to FIG. 1, various realizations of the invention indicate order information differently and process the order information differently (e.g., some order information flows into the order qualifier block 101 while the remainder flows into the memory operation order characteristic indication block 105, all of the incoming memory operation order information flows into the order qualifier block 101, etc.). FIG. 1 includes a dashed line, which may carry incoming memory operation order information, that bypasses the order qualifier block 101 and carries the incoming memory operation order information to the memory operation order characteristic indication block 105. For example, the wrap bit from FIG. 2 flows into the order qualifier block 101 while the remaining 5 bits of order information flow into the memory operation order characteristic indication block 105. The order qualifier block 101 either leaves the wrap bit unchanged or complements the value, depending on the type of the incoming memory operation (e.g., the wrap bit remains unchanged if the incoming memory operation is a read type memory operation and is complemented if the incoming memory operation is a write type memory operation).

The memory operation type qualifier block 103 receives resident memory operation type information and the incoming memory operation type information. The memory operation type qualifier block 103 indicates qualified resident memory operation types to the memory operation order characteristic indication block 105. That is, the memory operation type qualifier block 103 indicates those resident memory operations of a type of interest for the particular incoming memory operation. For example, if the incoming memory operation is a read type memory operation, then resident write type memory operations may be of interest. Alternatively, if the incoming memory operation is a write type memory operation, then resident read type memory operations may be of interest. Various realizations of the invention indicate type information differently (e.g., a vector having 1's set for resident memory operations of interest).
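The behavior of the two qualifier blocks described above can be sketched in software, purely as an aid to understanding. The sketch assumes RAW is the default hazard, that the wrap bit is the most significant bit of 6-bit order information, and that type bits are stored as 0 for read and 1 for write; all names are hypothetical.

```python
# Hypothetical model of qualifier blocks 101 and 103, assuming RAW is the
# default hazard and type bits are stored as 0 = read, 1 = write.
READ, WRITE = 0, 1
WRAP_MASK = 0b100000  # assumption: wrap bit is the MSB of 6-bit order info


def qualify_order(incoming_order: int, incoming_type: int) -> int:
    """Complement the wrap bit for an incoming write; leave reads unchanged."""
    if incoming_type == WRITE:
        return incoming_order ^ WRAP_MASK
    return incoming_order


def qualify_types(resident_types: list[int], incoming_type: int) -> list[int]:
    """Set a 1 for each resident memory operation of the type of interest."""
    type_of_interest = WRITE if incoming_type == READ else READ
    return [1 if t == type_of_interest else 0 for t in resident_types]


# Incoming write: the wrap bit is complemented and resident reads are of interest.
print(format(qualify_order(0b000111, WRITE), "06b"))   # prints: 100111
print(qualify_types([READ, READ, WRITE, READ], WRITE))  # prints: [1, 1, 0, 1]
```

For an incoming read, the same functions leave the order information untouched and mark resident writes as the operations of interest.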

The memory operation order characteristic indication block 105 receives the incoming memory operation's address representation (e.g., virtual address, virtual address tag, part of the virtual address, a hash of a physical address, etc.), the incoming memory operation's order information, the incoming memory operation's qualified order information, the qualified resident memory operation types, the resident memory operations' order information, and address representations of the resident memory operations. The following table provides an example of information for a RAW hazard that flows into the memory operation order characteristic indication block 105. The table is meant to aid in understanding the described invention and is not meant to limit the described invention.

TABLE 1
Information for RAW hazard

Memop     Pre-qualified   Qualified     Pre-qualified   Qualified
          order           order         memop type      memop type
          information     information   information     information
write A   000011          000011        1               1
write A   000101          000101        1               1
read B    000111          000111        0               0
write A   001011          001011        1               1
read A    100101          100101        0               0

Assume that the first four operations in Table 1 were issued in the order they appear from an operation scheduling unit and that the last operation is the incoming memory operation. Also assume that the logic illustrated in FIG. 1 is configured to handle RAW hazards as the default data hazard. The exemplary configured logic in this example marks read type memory operations with 0's and write type memory operations with 1's as each memory operation enters a memory disambiguation buffer. When the incoming memory operation is a read type memory operation, the order information and the type information will not be changed by the qualifier blocks 101 and 103.

Table 2 provides an example of information for an overeager read (OER) hazard that flows into the memory operation order characteristic indication block 105. OER hazards include overeager data hazards for read type operations, which can include OEL hazards.

TABLE 2
Information for OER hazard

Memop     Pre-qualified   Qualified     Pre-qualified   Qualified
          order           order         memop type      memop type
          information     information   information     information
read A    001011          001011        0               1
read A    010101          010101        0               1
write B   001100          001100        1               0
read A    100101          100101        0               1
write A   000111          100111        1               0

As with Table 1, assume that the first four operations in Table 2 were issued in the order they appear from an operation scheduling unit and that the last operation is the incoming memory operation. Also assume the same logic as assumed for Table 1. For the case of an incoming write type memory operation, the order information and the type information are qualified by the qualifier blocks 101 and 103. Assuming order information is implemented as depicted in FIG. 2 with a wrap bit (i.e., the most significant bit of the order information), the wrap bit is complemented for the incoming memory operation. In addition, the qualified memory operation type information for all of the memory operations, both resident and incoming, is the reverse of the pre-qualified information. The information and exemplary descriptions for Tables 1 and 2 are meant to aid in understanding the described invention and not meant to limit the described invention. This specific description for qualifying information is for exemplary purposes alone. Various realizations of the invention qualify different information with any number of techniques (e.g., memory operations of interest may be filtered without masks, the described exemplary logic may be configured to treat OER hazards as the default data hazard, etc.).

The memory operation order characteristic indication block 105 includes logic that indicates a particular order characteristic based at least in part on the received order information. Qualification of order information for one data hazard allows the same logic to be recycled for at least two different data hazards. For example, assume that the memory operation order characteristic indication block 105 includes logic that indicates resident memory operations that are older than an incoming memory operation (i.e., the default data hazard is a RAW hazard). Referring to FIG. 2, older memory operations have lower order information and younger memory operations have higher order information. Using the order information of FIG. 2 as an example, the following exemplary logic indicates those resident memory operations that are older than the incoming memory operation.

              Resident Memop       Incoming Memop
same wrap     order info      <    order info
!same wrap    order info      >    order info

If a resident memory operation with a same wrap bit as an incoming memory operation has order information that is less than the order information of the incoming memory operation, then the resident memory operation will be indicated as older (e.g., a corresponding bit in an age mask will be set to 1).

For an OER scenario, the following exemplary logic indicates memory operations younger than an incoming memory operation.

              Resident Memop       Incoming Memop
same wrap     order info      >    order info
!same wrap    order info      <    order info

The logic for indicating younger memory operations is the complement of the logic for indicating older memory operations. If the wrap bit of an incoming memory operation is complemented, then the logic that indicates older memory operations will indicate younger memory operations. Whether the memory operation order characteristic indication block 105 includes logic to indicate older resident memory operations or younger resident memory operations, qualifying the order information of an incoming memory operation allows both order characteristics to be indicated with the same logic. Various realizations of the invention may qualify order information differently (e.g., qualify all of the resident memory operations' order information). The memory operation order characteristic indication block 105 indicates the resident memory operations with the relevant order characteristic (i.e., older or younger) and passes the indications to a priority picker.
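The reuse of the older-indication logic can be checked with a small software model, offered only as a sketch. It assumes the 6-bit order information of FIG. 2 with the wrap bit as the most significant bit; the function names are hypothetical and the two functions simply transcribe the two exemplary comparison tables above.

```python
# Hypothetical model of the order comparison, assuming 6-bit order
# information whose MSB is the wrap bit (as in the FIG. 2 example).
WRAP_MASK = 0b100000
INDEX_MASK = 0b011111


def indicates_older(resident: int, incoming: int) -> bool:
    """'Older' logic: same wrap -> resident < incoming, else resident > incoming."""
    same_wrap = (resident & WRAP_MASK) == (incoming & WRAP_MASK)
    r, i = resident & INDEX_MASK, incoming & INDEX_MASK
    return r < i if same_wrap else r > i


def indicates_younger(resident: int, incoming: int) -> bool:
    """'Younger' logic: same wrap -> resident > incoming, else resident < incoming."""
    same_wrap = (resident & WRAP_MASK) == (incoming & WRAP_MASK)
    r, i = resident & INDEX_MASK, incoming & INDEX_MASK
    return r > i if same_wrap else r < i


# Exhaustively check that complementing the incoming wrap bit makes the
# 'older' logic produce exactly the indications of the 'younger' logic.
reuse_holds = all(
    indicates_older(r, i ^ WRAP_MASK) == indicates_younger(r, i)
    for r in range(64)
    for i in range(64)
)
print(reuse_holds)  # prints: True
```

The exhaustive check covers every pair of 6-bit values, so under these assumptions only the older-indication logic would need to be built.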

FIG. 3 depicts a flowchart for indicating an age characteristic according to realizations of the invention. At block 301, resident memory operation information and incoming memory operation information are received, and a resident memory operation is selected as a current memory operation. At block 303, it is determined if the address representations (e.g., virtual address, part of the virtual address, virtual address tag, hash of physical address, etc.) of the current resident memory operation and the incoming memory operation match. For example, the memory operation order characteristic indication block 105 may perform the comparison of address representations. Alternatively, the memory operation order characteristic indication block 105 receives a vector of information that indicates which resident memory operations have address representations that match the incoming memory operation's address representation. If the address representations do not match, then control flows to block 305. If the address representations match, then control flows to block 311.

At block 305, it is determined if there are more resident memory operations. If there are more resident memory operations, then control flows to block 307. If there are no more resident memory operations, then control flows to block 309.

At block 309, the order characteristic indications for the resident memory operations are sent.

At block 307, the next resident memory operation becomes the current resident memory operation. Control flows from block 307 back to block 303.

At block 311, it is determined if the wrap bit for the incoming memory operation and the current memory operation is the same. If the wrap bit is not the same, then control flows to block 315. If the wrap bit is the same, then control flows to block 313.

At block 315, it is determined if the current resident memory operation's order information is greater than the incoming memory operation's order information. If the current resident memory operation's order information is greater than the incoming memory operation's order information, then control flows to block 317. If the current resident memory operation's order information is less than the incoming memory operation's order information, then control flows to block 319.

At block 317, the current resident memory operation is indicated as older than the incoming memory operation.

At block 319, the current resident memory operation is indicated as younger than the incoming memory operation.

At block 313, it is determined if the current resident memory operation's order information is less than the incoming memory operation's order information. If the current resident memory operation's order information is less than the incoming memory operation's order information, then control flows to block 317. If the current resident memory operation's order information is greater than the incoming memory operation's order information, then control flows to block 319.

While the flow diagram shows a particular order of operations performed by certain realizations of the invention, it should be understood that such order is exemplary (e.g., alternative realizations may perform the operations in a different order, combine certain operations, overlap certain operations, perform certain operations in parallel, etc.). Also, it should be understood that in various realizations of the invention the decisions performed by the blocks in the flowchart depicted in FIG. 3 are implicit. The information flows through logic that operates on the information. For example, blocks 305, 307, and 309 may not be performed explicitly. The information is a vector of information that flows through logic without the logic stepping through each piece of information that corresponds to a different resident memory operation. In addition, block 303 may be performed by a different unit.
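For readers who prefer code to flowcharts, the decisions of FIG. 3 can be sketched sequentially as below. This is only an explicit, software-style rendering of what hardware may do implicitly and in parallel; the tuple layout and function name are hypothetical, and treating equal order information as "younger" is an assumption, since the flowchart does not describe that case.

```python
def indicate_order(residents, incoming):
    """Sketch of FIG. 3.

    residents: list of (address_rep, wrap_bit, index) tuples.
    incoming:  one (address_rep, wrap_bit, index) tuple.
    Returns 'older', 'younger', or None (no address match) per resident.
    """
    in_addr, in_wrap, in_index = incoming
    indications = []
    for addr, wrap, index in residents:      # blocks 301, 305, 307
        if addr != in_addr:                  # block 303: no match
            indications.append(None)
            continue
        if wrap == in_wrap:                  # block 311
            older = index < in_index         # block 313
        else:
            older = index > in_index         # block 315
        # Blocks 317 and 319; equal indexes fall through to 'younger'
        # here, which is an assumption not covered by the flowchart.
        indications.append("older" if older else "younger")
    return indications                       # block 309


residents = [("A", 0, 3), ("B", 0, 7), ("A", 1, 9), ("A", 0, 11)]
print(indicate_order(residents, ("A", 0, 5)))
# prints: ['older', None, 'older', 'younger']
```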

Table 3 below provides an example of vectors used to indicate particular resident memory operations to resolve an OER hazard.

TABLE 3
Vectors for OER hazard resolution

Memop     Qualified   Order    Qualified   Matching      Candidate
          order       char     type        address rep   memops
          vector      vector   vector      vector        vector
read A    001011      1        1           1             1
read A    010101      1        1           1             1
write B   001100      1        0           0             0
read A    100101      0        1           1             0
write A   100111      -        -           -             -

The order characteristic vector in Table 3 indicates which resident memory operations are younger than the incoming memory operation. As previously described, the logic indicates resident memory operations that are older than an incoming memory operation, but when the order information is qualified for OER hazards (again assuming the logic is configured for RAW hazards instead of for OER) the logic indicates younger resident memory operations. If all of the vectors are combined, then the only remaining candidate resident memory operations are the first two read type memory operations. The resident write type memory operation does not have a matching address representation and is of the wrong type. The third resident read type memory operation is not younger than the incoming write type memory operation.
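Combining the vectors amounts to a bitwise AND across the per-resident indications. The following sketch copies the three input vectors for the four resident memory operations from Table 3 and recomputes the candidate vector; the variable names are hypothetical.

```python
# Vectors copied from Table 3 for the four resident memory operations
# (the incoming write is not a resident and has no entries).
order_char_vector = [1, 1, 1, 0]
qualified_type_vector = [1, 1, 0, 1]
matching_address_rep_vector = [1, 1, 0, 1]

# A resident is a candidate only if every vector indicates it.
candidate_vector = [
    o & t & a
    for o, t, a in zip(
        order_char_vector, qualified_type_vector, matching_address_rep_vector
    )
]
print(candidate_vector)  # prints: [1, 1, 0, 0]
```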

FIGS. 4A-4B depict exemplary priority pickers according to realizations of the invention. FIG. 4A depicts an exemplary priority picker that selects a resident memory operation with a particular order characteristic according to realizations of the invention. A priority picker 401 receives four vectors of information in FIG. 4A. The priority picker 401 receives a memops_order_characteristic_indications vector, which indicates those resident memory operations with the order characteristic relevant to the data hazard being handled. For example, the memops_order_characteristic_indications vector indicates each resident memory operation that is older than an incoming memory operation for a RAW data hazard, or indicates each resident memory operation that is younger than an incoming memory operation for an OER data hazard. The priority picker 401 also receives an address_rep_matches vector, a memop_types vector, and a bov_match vector. The address_rep_matches vector indicates those resident memory operations with address representations that match the incoming memory operation's address representation. The memop_types vector indicates those resident memory operations of the relevant memory operation type (e.g., write type memory operations for a RAW hazard, read type memory operations for an OER hazard, etc.). The bov_match vector indicates those resident memory operations with data that overlaps the data corresponding to the incoming memory operation. The priority picker 401 also receives resident memory operations' order information. The priority picker 401 determines which memory operations to consider based on all of the received vectors.

The combination of the vectors indicates which of the resident memory operations have matching address representations, have data overlap, and are of the relevant memory operation type with respect to the incoming memory operation (i.e., which resident memory operations are candidates for selection). The order information of the candidate resident memory operations flows through the priority picker 401. The priority picker 401 indicates which one of the candidate resident memory operations satisfies a given order criterion, and sends indication of the selected resident memory operation. The selected resident memory operation is used for resolving a data hazard. For example, the priority picker 401 indicates the youngest of the candidate resident memory operations. The youngest indicated resident memory operation may be indicated for OER coloring, RAW bypass, etc. RAW bypass may be performed with a predictive technique that compares address representations instead of complete addresses to efficiently identify a write type memory operation candidate for RAW bypass. Such a technique is described in more detail in commonly owned, co-pending U.S. patent application Ser. No. 10/747,584, filed Dec. 29, 2003, naming as inventors Krishna Thatipelli and Balakrishna Venkatrao, entitled “Efficient Read After Write Bypass,” which is incorporated herein by reference in its entirety.
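A software analogue of picking the youngest candidate is sketched below. It is not the circuit of FIG. 4A; it assumes 6-bit order information with a most significant wrap bit, and it assumes all resident entries fall within one 32-entry window so that the pairwise age comparison is consistent. The function names are hypothetical.

```python
WRAP_MASK = 0b100000   # assumption: wrap bit is the MSB of 6-bit order info
INDEX_MASK = 0b011111


def is_older(a: int, b: int) -> bool:
    """True if order info a is older than order info b (wrap-aware)."""
    same_wrap = (a & WRAP_MASK) == (b & WRAP_MASK)
    ai, bi = a & INDEX_MASK, b & INDEX_MASK
    return ai < bi if same_wrap else ai > bi


def pick_youngest(candidate_vector, order_info):
    """Return the position of the youngest candidate, or None if there is none."""
    youngest = None
    for position, is_candidate in enumerate(candidate_vector):
        if not is_candidate:
            continue
        # Replace the current pick whenever it is older than this candidate.
        if youngest is None or is_older(order_info[youngest], order_info[position]):
            youngest = position
    return youngest


order_info = [0b001011, 0b010101, 0b001100, 0b100101]
print(pick_youngest([1, 1, 0, 0], order_info))  # prints: 1
```

Hardware would typically resolve this selection in parallel rather than by iterating; the loop is used here only for readability.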

FIG. 4B depicts another exemplary priority picker that selects a resident memory operation for data hazard resolution operations according to realizations of the invention. A priority picker 403 is similar to the priority picker 401 of FIG. 4A. However, unlike the priority picker 401, the priority picker 403 receives a vector that already indicates candidate resident memory operations. For example, the memory operation order characteristic indication block 105 of FIG. 1 determines those resident memory operations with address representations that match the incoming memory operation, and that are of the relevant memory operation type. It is assumed that other units have determined data overlap and AND'd the vectors together to generate the indications of candidate resident memory operations. The priority picker 403 selects one of the candidate resident memory operations and indicates the selected resident memory operation to one or more other units for data hazard resolution.

Utilizing a single priority picker in addition to recycling logic for different data hazards significantly reduces processor area consumed for data hazard resolution. A design that recycles the logic as previously described and utilizes a single priority picker maintains processor performance with data hazard resolution while releasing space on the processor for other logic and/or memory.

FIG. 5 depicts OER data hazard resolution operations according to realizations of the invention. In a first cycle, an incoming write type memory operation is received. Comparison of address representations is performed between the incoming write type memory operation and resident memory operations. Also, a byte overlap check is performed between the incoming write type memory operation and the resident memory operations. In a second cycle, a youngest of resident read type memory operations, which are younger than the incoming write type memory operation, with an address representation that matches the incoming write type memory operation's address representation and with data that overlaps the incoming write type memory operation is selected for coloring. For example, the priority picker 401 or 403 selects a youngest read type memory operation from a group of resident read type memory operations that are younger than the incoming write type memory operation. In a third cycle, the address of the incoming write type memory operation is received. In a fourth cycle, those younger resident read type memory operations with addresses matching the received write type memory operation's address are selected. In a fifth cycle, rewind logic operates based at least in part on the read type memory operations selected in the fourth cycle. In a sixth cycle, a rewind signal is transmitted. The number of cycles depicted in FIG. 5 is meant to aid in understanding the described invention and not meant to be limiting upon the described invention. It should be understood that the number of cycles to perform operations may vary between different platforms, as platforms evolve, as instruction sets change, etc.

FIG. 6 depicts coloring of memory operations for OER data hazard resolution according to realizations of the invention. A memory disambiguation buffer (MDB) 605 sends a coloring signal indicating identifiers for corresponding memory operations. For example, the MDB 605 indicates the incoming write type memory operation and the selected resident read type memory operation as in cycle 2 of FIG. 5. The MDB 605 sends the coloring signal to an operation fetch unit 601 and an operation renaming unit 603. The operation fetch unit 601 locates the indicated memory operations in an operation register 602 (e.g., mapping into the corresponding entries in the operation register 602 with the operation identifiers) and sets their coloring bits. The operation fetch unit 601 fetches operations from the operation register 602 and passes them to the operation renaming unit 603. The operation renaming unit 603 imposes a data dependency on the indicated operations and causes an operation scheduling unit 609 to issue operations in accordance with the coloring (e.g., indicating to the operation scheduling unit when a write type memory operation has completed). The operation scheduling unit 609 issues memory operations in accordance with the imposed data dependencies to the MDB 605.

In addition to reducing the processor area consumed, utilizing a single priority picker improves processor performance. Implementing coloring without the described techniques for utilizing a single priority picker would call for extra pipeline stages on the store path. Hence, stores would effectively take longer to complete from issue to retire with a coloring scheme based on multiple priority pickers. However, with logic that allows a single priority picker to be utilized, the benefits of coloring can be reaped without the cost of additional pipeline stages.

FIG. 7 depicts selection of read type memory operations for OER rewind according to realizations of the invention. In FIG. 7, an order indication logic 701, similar to the memory operation order characteristic indication block 105 of FIG. 1, receives order indications and memory operation type indications for both resident memory operations and an incoming memory operation. The order indication logic 701 indicates younger resident read memory operations based at least in part on the received indications. A block 703 is a content addressable memory (CAM) of resident memory operation addresses and compare logic. The block 703 compares the incoming memory operation's address against all of the resident memory operations' addresses. A CAM is depicted for illustrative purposes alone. Various realizations of the invention implement different mechanisms for storing resident memory operations' addresses and comparing them against an incoming memory operation's address. The block 703 indicates those resident memory operations with addresses that match the incoming memory operation's address. The indication from the block 703 and the indication from the order indication logic 701 flow into an AND gate 705. From the AND gate 705 flow indications of those resident read type memory operations with data dependencies on the incoming write type memory operation (i.e., OER hazards).
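The data flow of FIG. 7 can be approximated in a few lines of software, shown here only as an aid to understanding. The list comparison stands in for the CAM of block 703, the younger-read indications are taken as a given input from the order indication logic 701, and all names and address values are hypothetical.

```python
def oer_rewind_candidates(resident_addresses, younger_read_indications, incoming_address):
    """AND the address matches (block 703) with the younger-read indications (logic 701)."""
    # Stand-in for the CAM lookup: compare the incoming address to every resident.
    address_matches = [int(a == incoming_address) for a in resident_addresses]
    # Stand-in for AND gate 705.
    return [m & y for m, y in zip(address_matches, younger_read_indications)]


resident_addresses = [0x1000, 0x1000, 0x2000, 0x1000]
younger_read_indications = [1, 0, 1, 1]
print(oer_rewind_candidates(resident_addresses, younger_read_indications, 0x1000))
# prints: [1, 0, 0, 1]
```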

FIG. 8 depicts an exemplary memory disambiguation buffer communicating rewind to other units according to realizations of the invention. An operation fetch unit 805 fetches operations from an operation register 811 and passes them to an operation scheduling unit 803. The operation scheduling unit 803 passes memory operations, their order information, and their identifiers to an MDB 801. The MDB 801 detects a possible overeager read data hazard and sends an overeager read coloring signal to the operation fetch unit 805 and an operation renaming unit 807, similar to FIG. 6. After determining the incoming write type memory operation's address (e.g., from a data translation lookaside buffer), the MDB 801 determines overeager read hazards and generates a rewind signal. The MDB 801 sends the rewind signal along with indications of the relevant read type memory operations to a memory scheduling window 809. The memory scheduling window 809 flushes the indicated read type memory operations from its buffers, and may also drop requests from any of the read type memory operations being flushed. The operation scheduling unit 803 reissues the indicated read type memory operations in accordance with coloring bits. If there are multiple read type memory operations that have a data dependency on the incoming write type memory operation, then after a few iterations these read type memory operations will also have their coloring bits set.

FIG. 9 depicts exemplary sub-blocks of a memory disambiguation buffer according to realizations of the invention. A memory disambiguation buffer 931 includes a data check-address representation match sub-block 921, an instruction identifier sub-block 911, an address sub-block 903, and a priority picker sub-block 943. The data check-address representation match sub-block 921 includes a data enable array 923, a data overlap logic 925, an address representation array 927, and an address representation match logic 929. The data enable array 923 includes entries for each resident memory operation. Each of the entries indicates the amount of data enabled for the corresponding resident memory operation. The data overlap logic 925 determines data overlap based on these entries. The address representation array 927 (e.g., a summing content addressable memory) hosts address representations of the target memory locations for each of the resident memory operations. The address representation match logic 929 determines which of the resident memory operations' address representations match the incoming memory operation's address representation.
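One way to picture the data check performed by this sub-block is with byte-enable bitmasks. The Python sketch below is an assumption-laden illustration, not the described circuit: it assumes each data enable entry is a bitmask with one bit per byte of an aligned block, and that an address representation is any value that can be compared for equality. The function names are hypothetical.

```python
def byte_enable(offset, size):
    """Bitmask with `size` consecutive bits set, starting at bit `offset` (one bit per byte)."""
    return ((1 << size) - 1) << offset


def data_overlap(resident_enables, incoming_enable):
    """Flag resident entries whose enabled bytes intersect the incoming operation's bytes."""
    return [(enable & incoming_enable) != 0 for enable in resident_enables]


def representation_match(resident_reprs, incoming_repr):
    """Flag resident entries whose address representation equals the incoming one."""
    return [r == incoming_repr for r in resident_reprs]


def candidates(resident_enables, resident_reprs, incoming_enable, incoming_repr):
    """Entries that both overlap in data and match in address representation."""
    overlap = data_overlap(resident_enables, incoming_enable)
    match = representation_match(resident_reprs, incoming_repr)
    return [o and m for o, m in zip(overlap, match)]
```

For example, under these assumptions a 2-byte access at offset 0 overlaps a resident 4-byte access at offset 0 but not one at offset 4, even when both residents target the same block.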

The address sub-block 903 includes an overeager read (OER) mask generator 905, OER rewind logic 907, and an address array and compare logic 909. The OER mask generator 905 indicates those resident read type memory operations that are younger than the incoming write type memory operation. The address array and compare logic 909 (e.g., a content addressable memory) hosts the addresses of resident memory operations and compares them against the incoming memory operation's address. The OER rewind logic 907 generates a rewind signal depending on the information from the address array and compare logic 909 and the OER mask generator 905.

The instruction identifier sub-block 911 includes coloring logic 917 and an operation identifier array 913. The coloring logic 917 receives one or more inputs from the data check-address representation match sub-block 921 and the priority picker sub-block 943 that indicate which of the resident read type memory operations have been selected for coloring. The coloring logic 917 looks up in the operation identifier array 913 the operation identifier that corresponds to the selected memory operation.
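The lookup performed by the coloring logic can be sketched in Python as follows. The sketch assumes that the priority picker selects the youngest of the indicated resident operations, that a larger order value means younger, and that the operation identifier array is indexed in the same way as the indication mask; `pick_youngest` and `coloring_signal` are hypothetical names.

```python
def pick_youngest(indicated, orders):
    """Return the index of the youngest indicated entry, or None if none is indicated.

    Assumes a larger order value means a younger operation.
    """
    best = None
    for i, flag in enumerate(indicated):
        if flag and (best is None or orders[i] > orders[best]):
            best = i
    return best


def coloring_signal(indicated, orders, op_id_array, incoming_op_id):
    """Pair the incoming operation's identifier with the selected resident's identifier."""
    idx = pick_youngest(indicated, orders)
    if idx is None:
        return None  # nothing selected, so no coloring signal is produced
    return (incoming_op_id, op_id_array[idx])
```

The returned pair corresponds to the two identifiers that a coloring signal would carry in this simplified model.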

The sub-blocks depicted in FIG. 9 are exemplary. In addition, the memory disambiguation buffer may include additional sub-blocks, and the depicted sub-blocks may include more or less logic, neither of which is illustrated to avoid obfuscating the described invention.

The described invention may be provided as a computer program product, or software, that may include a machine-readable medium having stored thereon instructions, which may be used to program a computer system (or other electronic devices) to perform a process according to the present invention. A machine-readable medium includes any mechanism for storing or transmitting information in a form (e.g., software, processing application) readable by a machine (e.g., a computer). The machine-readable medium may include, but is not limited to, magnetic storage medium (e.g., floppy diskette); optical storage medium (e.g., CD-ROM); magneto-optical storage medium; read only memory (ROM); random access memory (RAM); erasable programmable memory (e.g., EPROM and EEPROM); flash memory; electrical, optical, acoustical or other form of propagated signal (e.g., carrier waves, infrared signals, digital signals, etc.); or other types of medium suitable for storing electronic instructions.

FIG. 10 depicts an exemplary computer system according to realizations of the invention. A computer system 1000 includes a processor unit 1001 (possibly including multiple processors). The processor unit 1001 includes recyclable logic for at least two different types of data hazards and a single priority picker for data hazard resolution. For example, the processor unit 1001 includes the memory disambiguation buffer depicted in FIG. 9, the data hazard resolution unit depicted in FIG. 1, etc. The computer system 1000 also includes a system memory 1007A-1007F (e.g., one or more of cache, SRAM, DRAM, RDRAM, EDO RAM, DDR RAM, EEPROM, etc.), a system bus 1003 (e.g., LDT, PCI, ISA, etc.), a network interface 1005 (e.g., an ATM interface, an Ethernet interface, a Frame Relay interface, etc.), and a storage device(s) 1009A-1009D (e.g., optical storage, magnetic storage, etc.). Realizations of the invention may include fewer or additional components not illustrated in FIG. 10 (e.g., video cards, audio cards, additional network interfaces, peripheral devices, etc.). The processor unit 1001, the storage device(s) 1009A-1009D, the network interface 1005, and the system memory 1007A-1007F are coupled to the system bus 1003. Although FIG. 10 illustrates the processor unit 1001 as including the branch prediction structure, various realizations of the invention implement the branch prediction structure differently (e.g., storage separate from the processor, storage in a co-processor, etc.).

While the invention has been described with reference to various realizations, it will be understood that these realizations are illustrative and that the scope of the invention is not limited to them. Many variations, modifications, additions, and improvements are possible. More generally, realizations in accordance with the present invention have been described in the context of particular realizations. For example, the blocks and logic units identified in the description are for understanding the described invention and not meant to limit the described invention. Functionality may be separated or combined in blocks differently in various realizations of the invention or described with different terminology. For example, an operation fetch unit may be referred to as an instruction fetch unit, an instruction buffer may perform some or all of the functionality of the operation fetch unit, the operation scheduling unit, and/or the renaming unit, the memory disambiguation buffer may be referred to as a data hazard resolution unit, the memory disambiguation buffer may include a data hazard resolution unit, etc.

These realizations are meant to be illustrative and not limiting. Accordingly, plural instances may be provided for components described herein as a single instance. Boundaries between various components, operations and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of claims that follow. Finally, structures and functionality presented as discrete components in the exemplary configurations may be implemented as a combined structure or component. These and other variations, modifications, additions, and improvements may fall within the scope of the invention as defined in the claims that follow.

1. A data hazard resolution unit that resolves at least two different data hazards between resident memory operations and incoming memory operations with a set of logic that indicates order of the resident memory operations relative to the incoming memory operations, wherein the indicated order corresponds to the data hazard being resolved, and that includes a priority picker to select one of the indicated resident memory operations for either data hazard, wherein the set of logic modifies order information of those incoming memory operations that correspond to a first of the data hazards, but does not modify order information of those incoming memory operations that correspond to a second of the data hazards.
 2. The data hazard resolution unit of claim 1 wherein the set of logic performs one of at least two comparison operations based at least in part on the order information.
 3. The data hazard resolution unit of claim 2 wherein the set of logic indicates the order as those resident memory operations that are younger than the incoming memory operations if the order information is not modified and indicates the order as those resident memory operations that are older than one of the incoming memory operations if the order information is modified.
 4. The data hazard resolution unit of claim 3 wherein the set of logic makes a greater than comparison between the resident memory operations and those incoming memory operations with order information that indicates a first value and makes a less than comparison between the resident memory operations and those incoming memory operations with order information that indicates a second value.
 5. The data hazard resolution unit of claim 4 wherein the order information includes one or more wrapping bits.
 6. The data hazard resolution unit of claim 1 wherein the data hazards include read-after-write and overeager-read.
 7. The data hazard resolution unit of claim 1 wherein the priority picker picks a youngest of the indicated resident memory operations.
 8. The data hazard resolution unit of claim 7 that bypasses data from the youngest of the indicated resident memory operations to one of the incoming memory operations, wherein the resident memory operations are write type memory operations that are older than the one of the incoming memory operations, which is a read type memory operation.
 9. The data hazard resolution unit of claim 7 that causes a dependency to be imposed between the youngest of the indicated resident memory operations and one of the incoming memory operations, wherein the indicated resident memory operations are read type memory operations that are younger than the one of the incoming memory operations, which is a read type memory operation.
 10. The data hazard resolution unit of claim 9 wherein said causes the dependency to be imposed includes causing information that corresponds to the youngest of the indicated resident memory operations and a corresponding one of the incoming memory operations to indicate the dependency.
 11. The data hazard resolution unit of claim 9 wherein causes imposition of the data dependency comprises generating a coloring signal and indicating the youngest read type memory operation and the corresponding write type memory operation.
 12. The data hazard resolution unit of claim 1 that compares addresses of each of the incoming memory operations against addresses of the resident memory operations and causes those indicated resident memory operations that correspond to an overeager-read data hazard to be reissued.
 13. The data hazard resolution unit of claim 12 wherein causes reissue comprises generating a rewind signal, which causes flushing of the indicated resident memory operations for reissue.
 14. The data hazard resolution unit of claim 1, wherein the data hazard resolution unit includes a memory disambiguation buffer or a load store queue.
 15. A method comprising: determining if an incoming memory operation is of a first or second type of memory operation; modifying order information of the incoming memory operation if the incoming memory operation is of a first type of memory operation, but not if the incoming memory operation is of a second type of memory operation; comparing the incoming memory operation and the resident memory operations based at least in part on the order information of the incoming memory operation; indicating those of the resident memory operations with a first order characteristic based at least on the comparing; and selecting one of the indicated resident memory operations with a second order characteristic relative to the other indicated resident memory operations.
 16. The method of claim 15 wherein the first type of memory operation includes write type memory operations and the second type of memory operation includes read type memory operations.
 17. The method of claim 15 wherein the order information is an index value that indicates order of memory operations in a data hazard resolution unit.
 18. The method of claim 15 wherein the first order characteristic includes the indicated resident memory operations being younger than the incoming memory operation.
 19. The method of claim 18 wherein the indicated resident memory operations include candidate resident memory operations for overeager-read rewind.
 20. The method of claim 15 wherein the first order characteristic includes the indicated resident memory operations being older than the incoming memory operation.
 21. The method of claim 20 wherein data of the selected one of the indicated resident memory operations with the second order characteristic is bypassed to the incoming memory operation.
 22. The method of claim 15 wherein the second order characteristic includes the selected indicated resident memory operation being younger than the other indicated resident memory operations.
 23. The method of claim 15 embodied as a computer-readable storage medium encoded with instructions that, when executed by a computer, cause the computer to perform the method.
 24. A method comprising: performing data hazard resolution operations for at least two different data hazards with a set of logic that modifies order information of incoming memory operations of a first type of memory operation, but does not modify order information of incoming memory operations of a second type of memory operation, and determines resident memory operations with a first order characteristic relative to the incoming memory operations based at least in part on the order information.
 25. The method of claim 24 wherein data hazard resolution operations comprise determining which of the resident memory operations have address representations that match an incoming memory operation's address representation.
 26. The method of claim 24 further comprising indicating one of the resident memory operations with the first order characteristic to have a second order characteristic with respect to the other resident memory operations with the first order characteristic.
 27. The method of claim 26 wherein the second order characteristic includes the indicated one of the resident memory operations being the youngest of the resident memory operations with the first order characteristic.
 28. The method of claim 27 wherein the youngest of the resident memory operations and an incoming memory operation are marked to indicate their data dependency.
 29. The method of claim 27 wherein data of the youngest of the resident memory operations is bypassed to an incoming memory operation.
 30. The method of claim 27 wherein the first order characteristic includes the resident memory operations being older or younger than the incoming memory operation.
 31. The method of claim 24 wherein the order information includes one or more wrapping bits and wherein said modifies the order information includes complementing the wrapping bits.
 32. The method of claim 24 wherein the first type of memory operation includes write type memory operations and a second type of memory operation includes read type memory operations.
 33. The method of claim 24 wherein the at least two different data hazards include read after write hazards and write after read hazards.
 34. The method of claim 24 wherein the data hazard resolution operations include read after write bypass and overeager read coloring.
 35. The method of claim 24 embodied as a computer-readable storage medium encoded with instructions that, when executed by a computer, cause the computer to perform the method.
 36. An apparatus comprising: an order information modifying logic to modify order information of an incoming memory operation if the memory operation is a first of at least two different types of memory operations, but not if the memory operation is a second of the at least two different types of memory operations; a memory operation type indication logic to indicate memory operation types; a memory operations comparison logic coupled with the memory operation type indication logic and the order information modifying logic, the memory operations comparison logic to compare the incoming memory operation against resident memory operations based at least in part on the incoming memory operation's order information and the resident memory operations' order information, and to indicate a first order characteristic of the resident memory operations relative to the incoming memory operation.
 37. The apparatus of claim 36 further comprising a fourth logic to indicate one of the resident memory operations as having a second order characteristic relative to the other resident memory operations.
 38. The apparatus of claim 37 further comprising a fifth logic to perform read after write bypass with a resident memory operation indicated by the fourth logic.
 39. The apparatus of claim 37 further comprising a fifth logic to cause data dependency to be imposed on an incoming memory operation and a resident memory operation indicated by the fourth logic.
 40. The apparatus of claim 39 wherein the data dependency is imposed with a coloring mechanism.
 41. The apparatus of claim 37 wherein the second order characteristic includes the indicated one as being the youngest.
 42. The apparatus of claim 41 wherein the first order characteristic includes the resident memory operations being younger than the incoming memory operation.
 43. The apparatus of claim 36 wherein the two different data hazards include a read after write hazard and a write after read hazard.
 44. The apparatus of claim 36 further comprising: a fourth logic to compare addresses of the incoming memory operation and the resident memory operations and to indicate those resident memory operations with addresses matching the incoming memory operation's address; and a fifth logic to cause rewind of those resident memory operations indicated by both the third and fourth logic.
 45. A data hazard resolution unit comprising: a resident memory operation address array block to host addresses of resident memory operations; a memory operations compare block to compare address representations of resident and incoming memory operations, to determine data overlap between resident and incoming memory operations, to modify order information of incoming memory operations of a first type, but not of a second type, and to indicate two different order characteristics of resident memory operations relative to an incoming memory operation for at least two different data hazards based at least in part on the order information; and a priority picker block to compare memory operations and indicate one resident memory operation with a second order characteristic relative to other resident memory operations.
 46. The data hazard resolution unit of claim 45 further comprising an operation identifier block to host identifiers of resident memory operations.
 47. The data hazard resolution unit of claim 46 further comprising the operation identifier block to determine operation identifiers of resident memory operations indicated by the priority picker, the incoming memory operation, and resident memory operations indicated by the memory operations compare block.
 48. The data hazard resolution unit of claim 47 to provide the operation identifiers of resident memory operations indicated by the memory operations compare block and the incoming memory operation for overeager read rewind.
 49. The memory disambiguation buffer of claim 47 to provide the operation identifier of a resident memory operation indicated by the priority picker and the operation identifier of the incoming memory operation for overeager read coloring.
 50. The data hazard resolution unit of claim 45 further comprising the address array block to compare hosted addresses against the incoming memory operation's address and to cause overeager read rewind for those resident memory operations with addresses matching the incoming memory operation's address and indicated by the memory operations compare block.
 51. The data hazard resolution unit of claim 45 wherein the memory operations compare block includes a summing content addressable memory to compare address representations.
 52. The data hazard resolution unit of claim 45 wherein the address array block includes a content addressable memory.
 53. The data hazard resolution unit of claim 45 wherein the memory operations compare block to modify order information includes the memory operations compare block to complement at least part of the order information.
 54. The data hazard resolution unit of claim 45 wherein the two different order characteristics include younger and older.
 55. The data hazard resolution unit of claim 45 wherein the at least two different data hazards include an overeager read hazard and a read after write hazard.
 56. A processor comprising: a memory operation register to host memory operations and to set coloring information that indicates data dependency between memory operations; and a data hazard resolution unit coupled with the memory operation register to modify order information of incoming memory operations that correspond to a first of at least two different data hazards, but not that correspond to a second of at least two different data hazards, and to indicate memory operations based at least in part on the order information to the memory operation register for setting of coloring information.
 57. The processor of claim 56 further comprising a data cache unit and the data hazard resolution unit to indicate memory operations to the data cache unit for read after write bypass.
 58. The processor of claim 56 further comprising: a memory operation renaming unit to impose data dependencies based at least in part on the coloring information of memory operations; and the data hazard resolution unit coupled with the memory operation renaming unit to indicate to the memory operation unit memory operations for coloring.
 59. The processor of claim 56 wherein the at least two different data hazards include a read after write hazard and an overeager read hazard.
 60. The processor of claim 56 further comprising: an operation scheduling unit to issue operations in accordance with imposed data dependencies; and the memory operation renaming unit to impose data dependencies on the operation scheduling unit.
 61. The processor of claim 60 wherein the order information indicates indexing information that corresponds to order of operations in the operation scheduling unit.
 62. The processor of claim 56, wherein the data hazard resolution unit includes a load store queue or a memory disambiguation buffer.
 63. The processor of claim 56, wherein the processor includes multiple cores.
 64. An apparatus comprising: a queue for memory operations; and means for modifying order information of memory operations that correspond to a first of at least two different data hazards, but not that correspond to a second of at least two different data hazards, and for indicating resident memory operations with a first order characteristic relative to an incoming memory operation based at least in part on the order information to resolve the corresponding data hazard.
 65. The apparatus of claim 64 further comprising means for generating a rewind signal to rewind read type memory operations corresponding to a detected overeager read hazard.
 66. The apparatus of claim 64 further comprising means for bypassing data from a resident write type memory operation to an incoming read type memory operation.
 67. The apparatus of claim 64 further comprising means for indicating a priority resident memory operation with a second order characteristic relative to other resident memory operations for overeager read coloring or for read after write bypass.